Problem area location in an image signal.

WO 00/11863
PCT/EP99/05771



The invention relates to problem area location in an image signal, and more 
specifically, to occlusion detection and halo reduction in motion compensated pictures. 

Every motion compensated scan-rate conversion method is confronted with the
problem of occlusions in a sequence. Several approaches have been attempted to cope with it.
In many cases the effort has been devoted to improving the quality of the motion estimation
method in order to obtain very precise motion boundaries (e.g. [1]). But in the regions where
covering or uncovering occurs, and where the motion estimation is performed by analyzing
two successive frames, motion estimation is an ill-posed problem [2] and cannot yield good
results. To overcome this problem many authors propose to use three frames [3] [4] [5] or four
frames [6], for both motion estimation and motion compensation. When architectural
constraints dictate the use of two frames only, an ad hoc interpolation strategy has to be
introduced. This strategy can be applied to every pixel of the image or can be preceded by the
localization of critical areas, i.e. by a segmentation of the image.

In [7] a method was disclosed for motion compensated picture signal
interpolation that reduces the negative effect of covering and uncovering on the quality of
interpolated images. In the described case, which applies an order statistical filter in the
up-conversion to replace the common MC-averaging, interpolated pictures result from pixels
taken from both adjacent fields.

In [2] and in [8] a segmentation for the same purpose was described. This
segmentation is based on a motion detector, and can only produce reliable results if covering
and uncovering occur over stationary backgrounds.

In [9] a method was disclosed that allows a reduction of halo defects in
architectures that enable access to one field only, or in systems particularly designed to have
access to one field only in order to obtain the increased resolution of an interpolation
according to [10].

In [11] a method was disclosed that uses two motion estimators, a causal
motion estimator (that predicts the future from the past) and an anti-causal motion estimator
(that predicts the past from the future). Depending on which one of the two estimators gives
the 'best match', the area is classified as covered or uncovered, and the corresponding
luminance value is taken from the previous or the next field.

In [12] the interpolation strategy is tuned depending on the 'difficulties' of the
image part. It combines several of the well-known algorithms for motion compensation,
aiming at exploiting their complementary strengths. The task of selecting the appropriate
algorithm is assigned to an Ordered Statistical filter. Where no adequate strategy is available,
as in covered/uncovered areas, it aims at softening the resulting artifacts.

Instead, in [13] it is stated that the general rule for an effective interpolation
seems to be: "if it is not possible to shift a small detail correctly because of faulty motion
vectors, better suppress it than smooth it". This is achieved, when there is a faulty vector
assigned and a correlated picture content, by extending the median mask used to filter the
candidates from the neighboring frames and, where there is no correlated picture content,
by using the probability distribution function of a Centered Median Filter to select the
candidates.

It is, inter alia, an object of the invention to provide a straightforward and
reliable occlusion detection and halo reduction. To this end, a first aspect of the invention
provides a problem area location method and device as defined by claims 1 and 8. A second
aspect of the invention provides a corresponding image interpolation method and device as
defined by claims 5 and 9. A third aspect of the invention provides an image display apparatus
as defined by claim 10. Advantageous embodiments are defined in the dependent claims.

In a method of locating problem areas in an image signal, a motion vector field
is estimated for the image signal, and edges are detected in the motion vector field. In a
corresponding method of interpolating images between existing images, image parts are
interpolated in dependence upon a presence of edges; preferably, an order statistical filtering is
used at edges.

The current invention basically adapts the interpolation strategy depending on a
segmentation of the image in various areas. Contrary to [2, 8], the current invention aims to be
valid even if both foreground and background are moving.

These and other aspects of the invention will be apparent from and elucidated
with reference to the embodiments described hereinafter.



In the drawings: 

Fig. 1 illustrates the basic recognition on which the present invention is based;

Figs. 2 and 3 illustrate covering and uncovering;

Fig. 4 shows a preferred embodiment of an image display apparatus in 
accordance with the present invention; and 

Fig. 5 shows a region detector for use in the embodiment of Fig. 4. 


Our method aims first at localizing, in a robust and cost-effective way, the areas
where vector based algorithms for scan rate conversion can produce very strong and annoying
artifacts. For those areas several solutions are proposed, depending on the target quality of the
up-conversion and on the cost constraints. The usefulness of this approach shall be proven in a
comparison with the aforementioned alternatives, although their benchmarking has not yet
been completed.

In order to detect areas in which covering or uncovering occurs, the current
algorithm just needs the information that is available in a motion vector field related to that
frame, and a very limited processing of that information. In fact the motion vector field
already describes the temporal behavior of the sequence, generally obtained using more than
one frame, thus no additional information is needed for covering/uncovering detection alone.

The current algorithm does not need an ad hoc motion estimation, provided that
the motion vector field used is intended to supply the true motion within the sequence. The
first step of the algorithm consists in detecting significant discontinuities in the given motion
vector field, assuming that these correspond to the borders of moving objects.

Fig. 1 shows a vertical line and two arrows at opposite sides of the vertical line.
Assuming that the vertical line is an edge in the motion vector field, and the arrows represent
the motion vectors at the two sides of the edge, by analyzing the vectors on both sides of the
edge we can conclude that there is covering when the longest vector points towards the edge,
or, when the two vectors have the same length, when both vectors point towards the edge.
Similarly there is uncovering when the longest vector points in the direction opposite to the
edge, or, if the two vectors have the same length, when they both point in directions opposite to
the edge. The three pictures at the left-hand side of Fig. 1 show covering, while the three
pictures at the right-hand side of Fig. 1 show uncovering. From an analysis of Fig. 1, it is also
possible to determine the width of the covered or uncovered area.

In a more formal way, let D(X, n) be the displacement vector assigned to the
center X = (Xx, Xy)^T of a block of pixels B(X) in the current field n. We check the vector
difference of the displacement vectors Dl(X-K, n) and Dr(X+K, n), where K = (k, 0)^T and k is a
constant. These motion vectors are those assigned to blocks situated on, respectively, the
left-hand and the right-hand side of every block B(X) in the current field n.

In a first approach we have taken only horizontal edges into consideration,
because they occur most frequently in sequences. Extending the algorithm in order to consider
edges in every direction is straightforward. When the absolute differences for both x and y
components are higher than a threshold value thr_e:

$$ |x_{D_l}(X-K, n) - x_{D_r}(X+K, n)| > thr_e \qquad (1) $$

$$ |y_{D_l}(X-K, n) - y_{D_r}(X+K, n)| > thr_e \qquad (2) $$

we decide that there is a significant edge within the block centered in X = (Xx, Xy).
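
As an illustration only, the test of equations (1) and (2) can be sketched in Python as
follows. The block vector field is assumed to be stored as rows of (x, y) tuples. The function
names, the `require_both` switch and the treatment of border blocks are assumptions of this
sketch and are not part of the disclosure.

```python
def has_edge(d_left, d_right, thr, require_both=True):
    """Test equations (1) and (2) for one block.

    d_left, d_right: (x, y) vectors of the blocks at X-K and X+K.
    require_both=True follows the text as written (both components must
    exceed thr); False is an assumed relaxation that accepts either one.
    """
    x_hit = abs(d_left[0] - d_right[0]) > thr
    y_hit = abs(d_left[1] - d_right[1]) > thr
    return (x_hit and y_hit) if require_both else (x_hit or y_hit)


def detect_edges(field, k, thr, require_both=True):
    """Return a boolean map marking blocks that contain a significant edge.

    field is a list of rows; each row is a list of (x, y) block vectors.
    Blocks closer than k to the left or right border are left unmarked
    (an assumption; the text does not discuss borders).
    """
    edges = []
    for row in field:
        marks = [False] * len(row)
        for c in range(k, len(row) - k):
            marks[c] = has_edge(row[c - k], row[c + k], thr, require_both)
        edges.append(marks)
    return edges
```

With k = 1, a row in which the vector changes between the third and the fourth block marks
both of those blocks, since each of them has one neighbor on either side of the discontinuity.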

Of course all the neighboring blocks of a block in which an edge has been
detected have to be considered blocks in which covering or uncovering can occur. They will
undergo the same procedure as those in which an edge has been directly detected. If edges
have been located, we can use the vector difference between Dl and Dr to decide upon
covering or uncovering. Considering positive the sign of a vector pointing from right to left, in
case:

$$ D_l(X-K, n) - D_r(X+K, n) > 0 \qquad (3) $$

there will be uncovering, whereas:

$$ D_l(X-K, n) - D_r(X+K, n) < 0 \qquad (4) $$

indicates covering.
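
A minimal sketch of the decision of equations (3) and (4) is given below. It assumes that the
comparison is carried out on the horizontal vector components, which matches the horizontal
neighborhood used above but is not stated explicitly in the equations. The function names and
the handling of equal vectors are likewise assumptions.

```python
def classify_edge(dl_x, dr_x):
    """Apply equations (3) and (4) to the horizontal vector components.

    dl_x, dr_x: horizontal components of Dl(X-K, n) and Dr(X+K, n), expressed
    with the sign convention of the text (right-to-left motion is positive).
    """
    diff = dl_x - dr_x
    if diff > 0:
        return "uncovering"
    if diff < 0:
        return "covering"
    return "none"  # equal vectors: no decision (case not covered by the text)


def leftward_positive(x_component):
    """Convert a component given with left-to-right positive (an assumed
    input convention) to the right-to-left positive convention of the text."""
    return -x_component
```

For example, a left-hand block moving 5 pixels to the right next to a stationary right-hand
block gives `classify_edge(leftward_positive(5), leftward_positive(0))`, which evaluates to
"covering", in agreement with the rule of Fig. 1 that the longest vector points towards the edge.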

Moreover we are able to determine the covering or uncovering width c/u_width
and height c/u_height of the area that has been covered or uncovered between the previous and
the current fields. These are related to the absolute difference for the x and the y components
of the vector difference:

$$ c/u_{width} = |x_{D_l}(X-K, n) - x_{D_r}(X+K, n)| > thr_e \qquad (5) $$

$$ c/u_{height} = |y_{D_l}(X-K, n) - y_{D_r}(X+K, n)| > thr_e \qquad (6) $$



In order to know which of the two vectors belongs to the background and which
one to the foreground, and thus where the foreground and the background are, we have to
consider that the edges move with the foreground velocity. By comparing two successive motion
vector fields, i.e. the location of the edges in these two fields, it is possible to say with which
of the two velocities the edges move. The velocity with which an edge moves will be the
foreground velocity, and the part of the image that moves with that velocity will be the
foreground. On the opposite side of the edge there will be the background, with the associated
background velocity.
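
The comparison of edge locations in two successive vector fields might be sketched as follows.
The sketch assumes a vertical edge tracked by its horizontal position in pixels, velocities in
pixels per field period measured along the same axis, and a nearest-match decision. These
choices and all names are assumptions made for illustration.

```python
def foreground_side(edge_x_prev, edge_x_cur, v_left_x, v_right_x):
    """Decide on which side of an edge the foreground lies.

    edge_x_prev, edge_x_cur: horizontal edge position in the previous and the
    current motion vector field.
    v_left_x, v_right_x: horizontal velocities on the two sides of the edge.
    The side whose velocity is closest to the observed edge displacement is
    taken as foreground, since the edge moves with the foreground velocity.
    """
    edge_motion = edge_x_cur - edge_x_prev
    err_left = abs(edge_motion - v_left_x)
    err_right = abs(edge_motion - v_right_x)
    if err_left == err_right:
        return "undecided"
    return "left" if err_left < err_right else "right"
```

The side that is not returned is then treated as background, with the other velocity as the
background velocity.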

The resolution of the edge localization as described so far is not finer than the,
say, 8x8 pixel block size used by the motion vector estimator. A less coarse method is
preferred. In order to improve on that, i.e. in order to have a better
localization of 'real covered/uncovered' areas at the interpolated position, and an accurate
choice of the vectors to be used in the up-conversion, we have developed two methods that
exploit the information gathered so far. What is meant by 'real covered/uncovered' areas
is shown in Fig. 2 and in Fig. 3, which illustrate how, at the temporal position of the
frame to be interpolated, only a portion of the area I (III) that has been detected as covered
(uncovered) from frame N to frame N+1 is actually covered, Ia (uncovered, IIIb).

The first refinement method developed is similar to what has been described in
a previous patent application [14]. It makes use of two different match errors calculated for the
same 2x2 block at the wanted interpolated position, using two different candidate vectors. In
this application they would be the vector on the left-hand side of the edge, and the vector on
the right-hand side of the edge. If at least one of the two errors is smaller than a pre-defined
threshold, we assume that the block we are dealing with belongs to the foreground, i.e. it
does not belong to a 'really covered/uncovered' area. In this case, the vector, among the two
tested, that gives the least error is the vector chosen to interpolate the foreground. The other
one should describe the displacement of the background in the neighboring blocks.
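
The decision rule of this first refinement method can be sketched as below, under the
assumption that the two match errors for the 2x2 block have already been computed. The function
name, the dictionary layout and the parameter `max_err` (standing for the pre-defined
threshold) are hypothetical.

```python
def refine_block(err_left, err_right, d_left, d_right, max_err):
    """Classify one 2x2 block from its two match errors.

    err_left, err_right: match errors obtained with the vectors found on the
    left-hand and right-hand side of the edge (d_left, d_right).
    If neither error is below max_err, the block is regarded as belonging to
    a 'really covered/uncovered' area. Otherwise the vector with the least
    error interpolates the foreground and the other one is taken to describe
    the background.
    """
    if min(err_left, err_right) >= max_err:
        return {"occluded": True, "foreground": None, "background": None}
    if err_left <= err_right:
        return {"occluded": False, "foreground": d_left, "background": d_right}
    return {"occluded": False, "foreground": d_right, "background": d_left}
```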

The second method, that will be described hereafter, also makes use of the fact
that in 'real covering/uncovering' areas the match error at the interpolated position is high
whatever vector is used, since, as we know, no good match is possible here. This second
method only needs one match error and will look for the gradient of it along a line, from left to
right (or vice versa), in every portion of the frame in which covering or uncovering has been
detected.

Using one of the two vectors on the sides of the edge, we in fact calculate, on
every 2x2 block belonging to the 8x8 block that we know to be affected by either covering
or uncovering, the SAD error e at the temporal position of the frame to be interpolated:

$$ e(D, X, n + tpos) = \sum_{x \in B'(X)} \left| F(x - tpos\,D, n) - F(x + (1 - tpos)\,D, n + 1) \right| \qquad (7) $$

We assume that this error will show a sudden increase as soon as the area
considered belongs to the 'real covered/uncovered' areas. The edge of the
covered/uncovered areas is set in the 2x2 block where the error is double the error
calculated for the block on its left. The width of the covered/uncovered areas is known from
what was previously described in equation (5). Thus it is possible to extrapolate where the
covering/uncovering areas are within the frame.
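
The following sketch illustrates equation (7) and the search for the sudden error increase.
Fields are assumed to be lists of rows of luminance values. Rounding the shifts to whole
pixels, clamping at the field borders, and reading "double" as "at least double" are
simplifications of this sketch, and all names are hypothetical.

```python
def clamp(v, lo, hi):
    return max(lo, min(hi, v))


def sad(prev, nxt, bx, by, d, tpos, size=2):
    """Equation (7) for the size x size block whose top-left pixel is (bx, by).

    prev, nxt: fields n and n+1; d: candidate vector (dx, dy);
    tpos: temporal position of the interpolated frame, between 0 and 1.
    """
    h, w = len(prev), len(prev[0])
    dx, dy = d
    total = 0
    for y in range(by, by + size):
        for x in range(bx, bx + size):
            # Position in field n, shifted back over tpos * D.
            xp = clamp(x - round(tpos * dx), 0, w - 1)
            yp = clamp(y - round(tpos * dy), 0, h - 1)
            # Position in field n+1, shifted forward over (1 - tpos) * D.
            xn = clamp(x + round((1 - tpos) * dx), 0, w - 1)
            yn = clamp(y + round((1 - tpos) * dy), 0, h - 1)
            total += abs(prev[yp][xp] - nxt[yn][xn])
    return total


def find_error_jump(errors):
    """Scan SAD values of consecutive 2x2 blocks from left to right and return
    the index of the first block whose error is at least double that of the
    block on its left, or None if no such block exists."""
    for i in range(1, len(errors)):
        if errors[i] > 0 and errors[i] >= 2 * errors[i - 1]:
            return i
    return None
```

Starting from the block index returned by `find_error_jump`, the width from equation (5)
would then give the extent of the covered or uncovered area along the scan line.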
Experiments have proven that the first method performs better than the second
one. The operations count, when we consider a peak load of 10%, is comparable for the two
methods; we therefore propose the first method as the preferred embodiment.

Once we have a clear classification of the areas in the interpolated frame as
belonging to three distinct categories, i.e. present in both frames, really covered and really
uncovered, and we know where the background is, what velocity it has, and where the
foreground is and what velocity it has, we can design an ad hoc interpolation strategy.

We now propose to use different interpolation strategies for the various regions
categorized as described above.

A first approach, the simplest one, will not reduce the visual artifacts in the
occlusion areas compared to the previous method. However, it can provide a way to obtain a
generally improved output with respect to classical methods such as the motion compensated
3-tap median filtering, or the motion compensated averaging, applied on the entire frame.
Moreover the operation count can be strongly reduced in comparison with what is required by
the median method, since the ordered statistical filtering is needed only for a portion of pixels
in the frame that is generally not bigger than 10%. This method seems to be particularly
interesting for software implementation, e.g. on the Philips Trimedia processor (TM1000,
TM2000), since it provides a quality which is better than that of a 'median' approach, with an
operation count reduced to about 1/4 of that of the median method.

This approach uses only the information on where the occlusion areas are, i.e.
where significant edges in the motion vector field have been detected:

$$ F(x, n + tpos) = \begin{cases} med\left(F(x - tpos\,D(x, n), n),\ Av,\ F(x + (1 - tpos)\,D(x, n), n + 1)\right), & \text{(occlusion areas)} \\ \frac{1}{2} F(x - tpos\,D(x, n), n) + \frac{1}{2} F(x + (1 - tpos)\,D(x, n), n + 1), & \text{(otherwise)} \end{cases} \qquad (8) $$

i.e. we propose to use motion compensated 3-tap median filtering in occlusion
areas and motion compensated averaging otherwise.
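
For a single pixel, the switch between the two interpolators can be sketched as follows. The
three input samples are assumed to have been fetched beforehand. Av is taken here to be the
non-motion compensated average of the neighboring pictures, as listed among the salient
features further on; that reading, and the function names, are assumptions of this sketch.

```python
def median3(a, b, c):
    """Middle value of three samples."""
    return sorted((a, b, c))[1]


def interpolate_pixel(prev_mc, next_mc, av, occluded):
    """Per-pixel interpolation in the spirit of the equation above.

    prev_mc: F(x - tpos*D(x, n), n), sample from the previous field.
    next_mc: F(x + (1 - tpos)*D(x, n), n + 1), sample from the next field.
    av: the value Av (assumed: non-motion compensated average at x).
    occluded: True where an edge in the motion vector field was detected.
    """
    if occluded:
        return median3(prev_mc, av, next_mc)
    return 0.5 * prev_mc + 0.5 * next_mc
```

Only the pixels flagged as occluded pay for the sort; all other pixels use the cheaper average,
which is where the reduction in operation count comes from.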

If the goal is to have better interpolation, it seems best to interpolate the result
from the previous field only, or mainly, in case of 'real covering' of the background (region
IIIb in Fig. 3), whereas in case of 'real uncovering' of the background (region Ia in Fig. 3), the
motion compensated data from the current field is preferred in the interpolation process. In all
other cases, the motion compensated interpolator can use the data from both fields. A way to
do this is described in the following equation:



$$ F(x, n + tpos) = \begin{cases} med\left(F(x - tpos\,D_{cov}(x, n), n),\ Av,\ F(x - tpos\,D(x, n), n)\right), & \text{(covering areas)} \\ med\left(F(x + (1 - tpos)\,D_{uncov}(x, n), n + 1),\ Av,\ F(x + (1 - tpos)\,D(x, n), n + 1)\right), & \text{(uncovering areas)} \\ \frac{1}{2} F(x - tpos\,D(x, n), n) + \frac{1}{2} F(x + (1 - tpos)\,D(x, n), n + 1), & \text{(otherwise)} \end{cases} \qquad (9) $$



This method provides an increased quality compared with the previous
methods, and can also provide a reduced operation count, due to the fact that the more
expensive operation (the median filter) will be applied only on a small portion of the frame.
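
A per-pixel sketch of the three-way strategy of equation (9) is given below. The samples are
assumed to have been fetched beforehand; `prev_cov` and `next_uncov` stand for the samples
obtained with the vectors used in covering and uncovering areas respectively. The region
labels, parameter names and the meaning assigned to Av are assumptions of this sketch.

```python
from statistics import median


def interpolate_pixel_regions(region, prev_mc, next_mc, prev_cov, next_uncov, av):
    """Per-pixel interpolation in the spirit of equation (9).

    region: "covering", "uncovering" or anything else for ordinary areas.
    prev_mc, next_mc: motion compensated samples from fields n and n+1
    obtained with D(x, n).
    prev_cov: sample from field n obtained with the covering-area vector.
    next_uncov: sample from field n+1 obtained with the uncovering-area vector.
    av: the value Av of equation (9).
    """
    if region == "covering":
        # Covered background is only visible in the previous field.
        return median([prev_cov, av, prev_mc])
    if region == "uncovering":
        # Uncovered background is only visible in the next field.
        return median([next_uncov, av, next_mc])
    return 0.5 * prev_mc + 0.5 * next_mc
```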

A preferred embodiment of the current invention is shown in the block diagram
of Fig. 4. An input video signal I is applied to a motion estimator unit MEU having a motion
vector estimator ME1. In the motion estimator unit MEU, the input video signal I is applied to
a field delay FM1, and to a first input of the motion vector estimator ME1. An output
signal of the field delay FM1 is applied to a second input of the motion vector estimator ME1
through a shifter S1. The motion vector estimator ME1 supplies motion vectors Df and
corresponding motion estimation errors ef.
The input video signal I is also applied to a motion-compensated interpolator
MCI. In the motion-compensated interpolator MCI, the input video signal I is applied to a field
delay FM2, and to a shifter S3. An output of the field delay FM2 is applied to a shifter S4. The
shifters S3, S4 are controlled by the motion vectors Df received from the motion estimator unit
MEU. Outputs of the shifters S3, S4 are applied to a median circuit med and to an average
circuit Av. Outputs of the median circuit med and the average circuit Av are applied to a
multiplexer MUX which supplies the output signal O to a display device CRT for displaying
the output signal at, for example, a 100 Hz field rate. The motion vectors Df and their errors ef
are applied to a region detector RD which furnishes a control signal to the multiplexer MUX.
In accordance with the present invention, this region detector RD carries out an edge detection
and determines the height and width of covered/uncovered areas as described above.

Fig. 5 shows a region detector RD for use in the embodiment of Fig. 4. The
motion vectors Df estimated by the motion estimator unit MEU are applied to a delay unit DU to
furnish left-hand motion vectors Dl and right-hand motion vectors Dr. These motion vectors
Dl, Dr are applied to a subtraction unit SU operating in accordance with the equations (1) to (6)
described above, to obtain control signals for the multiplexer MUX and the shifters S3, S4 of
Fig. 4.

In sum, the algorithm described in this disclosure aims first at localizing, in a
robust and cost-effective way, covering/uncovering areas. In those areas motion compensated
scan rate conversion can produce very strong and annoying artifacts. To reduce them several
solutions have been proposed, depending on the target quality of the up-conversion and on the
cost constraints. When the target is improving the cost effectiveness, an up-conversion
strategy can be chosen that provides a quality comparable to that of standard methods, with
an operations count reduced to about 1/3 of that of the median method. With a somewhat smaller
gain (or a comparable effort) the quality can also be improved. This method seems to be
particularly interesting for software implementation, e.g. on the Philips Trimedia processor
(TM1000-TM2000), but we believe that it could improve any up-conversion strategy.

Salient features of the invention can be summarized as follows:

A method, and apparatus realizing this method, that locates 'occlusion (difficult)
areas' in a frame, comprising: means to estimate a motion vector field, and characterized in
that it exploits the output of an edge detector acting on the motion vector field.

Such a method, and apparatus realizing this method, in which the detector
signals locations in the picture, where the difference in x-component of the motion vector of
horizontally neighboring vectors (or the difference in y-component of the motion vector of
vertically neighboring vectors) exceeds a threshold, thus giving an indication on where the
occlusion areas are (without distinction between covered or uncovered areas).

Such a method, and apparatus realizing this method, in which the difference
plus the difference in signs of the aforementioned motion vectors give indication on where the
covered areas and the uncovered areas are.

Such a method, and apparatus realizing this method, in which interpolation
means, for interpolating pictures in between existing ones, is adapted to the presence of
difficult areas by using a motion compensation averaging or a plain shift over the motion
vector of the nearest existing picture to generate the output interpolated picture in areas where
the edge detector finds no discontinuities, and using an order statistical filter to interpolate
picture parts in which the edge detector signals a discontinuity.

Such a method, and apparatus realizing this method, in which interpolation
means, for interpolating pictures in between existing ones, is adapted to the presence of
covered or uncovered areas by using a motion compensation averaging or a plain shift over the
motion vector of the nearest existing picture to generate the output interpolated picture in areas
where the edge detector finds no discontinuities, and using mainly either of the two
neighboring frames for interpolating the occlusion areas, depending on whether a covered area
or an uncovered area has been detected.

A method, and apparatus realizing this method, for interpolating pictures in
between existing ones, comprising: means to estimate a motion vector field, and means to
interpolate pictures from existing ones using this motion vector field, characterized in that the
interpolation means adapt to the output of an edge detector acting on the motion vector field.

Such a method, and apparatus realizing this method, in which the detector
signals locations in the picture, where the difference in x-component of the motion vector of
horizontally neighboring vectors (or the difference in y-component of the motion vector of
vertically neighboring vectors) exceeds a threshold.

Such a method, and apparatus realizing this method, in which the adaptation
consists in using a motion-compensated average or a plain shift over the motion vector of the
nearest existing picture to generate the output interpolated picture in areas where the edge
detector finds no discontinuities, and using an order statistical filter to interpolate picture parts
in which the edge detector signals a discontinuity.

Such a method, and apparatus realizing this method, in which the order
statistical filter uses information from the previous picture shifted over the motion vector,
information from the next picture shifted (backwards) over the motion vector, and a
non-motion compensated average of the neighboring pictures, to calculate the output.

Such a method, and apparatus realizing this method, in which the above-mentioned
refinement is applied.

It should be noted that the above-mentioned embodiments illustrate rather than
limit the invention, and that those skilled in the art will be able to design many alternative
embodiments without departing from the scope of the appended claims. In the claims, any
reference signs placed between parentheses shall not be construed as limiting the claim. The
word "comprising" does not exclude the presence of other elements or steps than those listed
in a claim. The expression "at" also includes the notion "near". The invention can be
implemented by means of hardware comprising several distinct elements, and by means of a
suitably programmed computer. In the device claim enumerating several means, several of
these means can be embodied by one and the same item of hardware.

CLAIMS:



1. A method of locating problem areas in an image signal (I), the method
comprising:

estimating (MEU) a motion vector field (Df) for said image signal (I); and
detecting (RD) edges in the motion vector field (Df).

2. A method as claimed in claim 1, wherein said edges detecting step (RD) 
includes comparing (SU) motion vectors (Df) from mutually different spatial positions. 

3. A method according to claim 2, wherein said comparing step (SU) includes:

determining absolute differences in motion vector components of two motion vectors (Df)
corresponding to two spatially neighboring locations to detect edges in the motion vector field
and a size of a covered or uncovered area; and

determining differences in motion vector components of two motion vectors
(Df) corresponding to said two spatially neighboring locations to determine whether there is
covering or whether there is uncovering.

4. A method as claimed in claim 1, wherein edge locations in successive field
periods are compared to distinguish between foreground and background.

5. A method (MEU, MCI) of interpolating images between existing images (I), the
method comprising:

estimating (MEU) a motion vector field for an image signal;

detecting (RD) edges in the motion vector field; and

interpolating (MCI) image parts in dependence upon a presence of edges.

6. A method as claimed in claim 5, wherein said interpolating step (MCI) includes
using an order statistical filtering (med) at edges.

7. A method as claimed in claim 5, further comprising:

subdividing image blocks at edges into smaller blocks;

using for each of the smaller blocks that motion vector (Df) among the motion
vectors at opposite sides of an edge, which yields a lowest match error (ef).

8. A device for locating problem areas in an image signal (I), the device
comprising:

means (MEU) for estimating a motion vector field (Df) for said image signal
(I); and

means (RD) for detecting edges in the motion vector field (Df).

9. A device (MEU, MCI) for interpolating images between existing images (I), the
device comprising:

means (MEU) for estimating a motion vector field (Df) for an image signal (I);
means (RD) for detecting edges in the motion vector field (Df); and
means for interpolating (MCI) image parts in dependence upon a presence of
edges.

10. An image display apparatus, comprising:

a device (MEU, MCI) for interpolating images between existing images (I) as
claimed in claim 9; and

a display device (CRT) coupled to an output of said interpolating device (MEU,
MCI).